Boosting systems for LVCSR
Authors
Abstract
We employ a variant of the popular AdaBoost algorithm to train multiple acoustic models such that the aggregate system outperforms the individual recognizers. The models are trained sequentially on re-weighted versions of the training data. At each iteration, the weights are decreased for the frames that are correctly decoded by the current system. These weights are then multiplied by the frame-level statistics for the decision trees and Gaussian mixture components of the next iteration's system. The composite system uses a log-linear combination of HMM state observation likelihoods. We report experimental results on several broadcast news transcription setups that differ in the language spoken (English and Arabic) and in the amount of training data. Additionally, we study the impact of boosting on maximum likelihood (ML) and discriminatively trained acoustic models. Our findings suggest that significant gains can be obtained for small amounts of training data even after feature- and model-space discriminative training.
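The re-weighting and combination steps described in the abstract can be illustrated with a minimal sketch. It assumes the classic AdaBoost.M1 update (correct frames scaled by beta = err / (1 - err)), which may differ from the variant the authors actually use; the function names `reweight_frames`, `weighted_mean`, and `combine_log_likelihoods`, as well as the combination weights `lambdas`, are hypothetical and chosen only for illustration.

```python
def reweight_frames(weights, correct):
    """Shrink the weights of correctly decoded frames (AdaBoost.M1-style assumption)."""
    total = sum(weights)
    # Weighted error: share of the weight mass on incorrectly decoded frames.
    err = sum(w for w, c in zip(weights, correct) if not c) / total
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    beta = err / (1.0 - err)  # beta < 1, so correct frames lose weight
    updated = [w * beta if c else w for w, c in zip(weights, correct)]
    norm = sum(updated)
    return [w / norm for w in updated], beta


def weighted_mean(values, weights):
    """Frame-level statistic scaled by frame weights (here a 1-D Gaussian mean)."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def combine_log_likelihoods(log_likes, lambdas):
    """Log-linear combination: sum over models k of lambda_k * log p_k(x | s)."""
    return sum(lam * ll for lam, ll in zip(lambdas, log_likes))


# Four frames with uniform weights; the last frame was decoded incorrectly.
new_weights, beta = reweight_frames([0.25, 0.25, 0.25, 0.25],
                                    [True, True, True, False])
# Observation log-likelihoods of one HMM state under two models, equally weighted.
score = combine_log_likelihoods([-2.0, -4.0], [0.5, 0.5])
```

In this toy run the misdecoded frame ends up carrying half of the total weight, so the next model's statistics are dominated by the frames the current system gets wrong.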
Related papers
Comparative study of boosting and non-boosting training for constructing ensembles of acoustic models
This paper compares the performance of Boosting and non-Boosting training algorithms in large vocabulary continuous speech recognition (LVCSR) using ensembles of acoustic models. Both algorithms demonstrated significant word error rate reduction on the CMU Communicator corpus. However, both algorithms produced comparable improvements, even though one would expect that the Boosting algorithm, whi...
Improving the performance of an LVCSR system through ensembles of acoustic models
This paper describes our work on applying ensembles of acoustic models to the problem of large vocabulary continuous speech recognition (LVCSR). We propose three algorithms for constructing ensembles. The first two have their roots in bagging algorithms; however, instead of randomly sampling examples our algorithms construct training sets based on the word error rate. The third one is a boostin...
Boosting Gaussian mixtures in an LVCSR system
In this paper, we apply boosting to the problem of frame-level phone classification, and use the resulting system to perform voicemail transcription. We develop parallel, hierarchical, and restricted versions of the classic AdaBoost algorithm, which enable the technique to be used in large-scale speech recognition tasks with hundreds of thousands of Gaussians and tens of millions of training fra...
Cross-lingual and multi-stream posterior features for low resource LVCSR systems
We investigate approaches for building large vocabulary continuous speech recognition (LVCSR) systems for new languages or new domains using limited amounts of transcribed training data. In these low resource conditions, the performance of conventional LVCSR systems degrades significantly. We propose to train low resource LVCSR systems with additional sources of information like annotated data from other l...
Design of Fast LVCSR Systems
This paper describes the development of fast (less than 10 times real-time) large vocabulary continuous speech recognition (LVCSR) systems based on technology developed for unlimited runtime systems assembled for participation in recent DARPA/NIST LVCSR evaluations. A general system structure for 10 times real-time systems is proposed and two specific systems that have been built for Broadcast ...